Empar: EM-based algorithm for parameter estimation of Markov models on trees

Authors

  • Ania Kedzierska
  • Marta Casanellas
Abstract

The goal of branch length estimation in phylogenetic inference is to estimate the divergence time between a set of sequences based on compositional differences between them. A number of software packages are currently available that facilitate branch length estimation for homogeneous and stationary evolutionary models. Homogeneity of the evolutionary process imposes fixed rates of evolution throughout the tree. For complex data, this assumption is likely to call the results of the analyses into question. In this work we propose an algorithm for parameter and branch length inference in discrete-time Markov processes on trees. This broad class of nonhomogeneous models comprises the general Markov model and all its submodels, including both stationary and nonstationary models. We adapt the well-known Expectation-Maximization algorithm and present a detailed performance study of this approach for a selection of nonhomogeneous evolutionary models. We conducted an extensive performance assessment on multiple sequence alignments simulated under a variety of settings. The tool showed high accuracy in parameter estimation and branch length recovery, indicating that the method is a valuable tool for phylogenetic inference in real-life problems. Empar is an open-source C++ implementation of the methods introduced in this paper and is the first tool designed to handle nonhomogeneous data.
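The abstract does not spell out the EM updates, so the following is only a rough illustration of the general idea, not Empar's code (Empar itself is written in C++ and works on general trees). The sketch assumes the simplest Markov model on a tree: a star tree with one hidden root and three observed leaves, a root distribution `pi`, and one transition matrix per edge with no homogeneity constraint across edges. All function names, the 4-state encoding (0..3), and the starting values are hypothetical choices made for this example.

```python
import math
import random

def site_joint(pi, mats, site):
    """Joint probability of each hidden root state together with one observed site."""
    return [pi[x] * math.prod(mats[i][x][site[i]] for i in range(len(site)))
            for x in range(len(pi))]

def log_likelihood(pi, mats, sites):
    """Log-likelihood of the alignment, summing out the hidden root state."""
    return sum(math.log(sum(site_joint(pi, mats, s))) for s in sites)

def em_step(pi, mats, sites):
    """One EM iteration: expected counts (E-step), then normalisation (M-step)."""
    k, n_leaves = len(pi), len(mats)
    root_counts = [0.0] * k
    edge_counts = [[[0.0] * k for _ in range(k)] for _ in range(n_leaves)]
    # E-step: posterior weight of every root state at every site.
    for s in sites:
        joint = site_joint(pi, mats, s)
        total = sum(joint)
        for x in range(k):
            w = joint[x] / total
            root_counts[x] += w
            for i in range(n_leaves):
                edge_counts[i][x][s[i]] += w
    # M-step: expected counts become new probabilities (each row sums to 1).
    new_pi = [c / len(sites) for c in root_counts]
    new_mats = [[[edge_counts[i][x][y] / root_counts[x] for y in range(k)]
                 for x in range(k)] for i in range(n_leaves)]
    return new_pi, new_mats

def random_start(k, n_leaves, rng):
    """Uniform root distribution and diagonally dominant random matrices (an assumption)."""
    def row(x):
        raw = [(3.0 if y == x else 1.0) + rng.random() for y in range(k)]
        total = sum(raw)
        return [v / total for v in raw]
    return [1.0 / k] * k, [[row(x) for x in range(k)] for _ in range(n_leaves)]

def simulate(pi, mats, n_sites, rng):
    """Draw independent sites: sample a root state, then each leaf given the root."""
    k = len(pi)
    sites = []
    for _ in range(n_sites):
        x = rng.choices(range(k), weights=pi)[0]
        sites.append(tuple(rng.choices(range(k), weights=m[x])[0] for m in mats))
    return sites

def run_em(sites, k=4, n_leaves=3, iters=50, seed=0):
    """Run a fixed number of EM iterations and record the log-likelihood trace."""
    rng = random.Random(seed)
    pi, mats = random_start(k, n_leaves, rng)
    history = [log_likelihood(pi, mats, sites)]
    for _ in range(iters):
        pi, mats = em_step(pi, mats, sites)
        history.append(log_likelihood(pi, mats, sites))
    return pi, mats, history
```

A possible use is to simulate a few hundred sites with `simulate`, pass them to `run_em`, and check that the recorded log-likelihood never decreases, which is the standard EM guarantee. The sketch has no convergence test and no guard against numerical underflow on long alignments; a real tool would need both.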

Similar resources

A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models

We describe the maximum-likelihood parameter estimation problem and how the Expectation-Maximization (EM) algorithm can be used for its solution. We first describe the abstract form of the EM algorithm as it is often given in the literature. We then develop the EM parameter estimation procedure for two applications: 1) finding the parameters of a mixture of Gaussian densities, and 2) finding the...

Time discretization of continuous-time filters and smoothers for HMM parameter estimation

In this paper we propose algorithms for parameter estimation of fast-sampled homogeneous Markov chains observed in white Gaussian noise. Our algorithms are obtained by the robust discretization of stochastic differential equations involved in the estimation of continuous-time Hidden Markov Models (HMMs) via the EM algorithm. We present two algorithms: the first is based on the robust discretiz...

Noise Compensation by a Sequential Kullback Proximal Algorithm

We present sequential parameter estimation in the framework of Hidden Markov Models. The sequential algorithm is a sequential Kullback proximal algorithm, which uses the Kullback-Leibler divergence as a penalty function for the maximum likelihood estimation. The scheme is implemented as filters. In contrast to algorithms based on the sequential EM algorithm, the algorithm has faster conver...

An Adaptive Approach to Increase Accuracy of Forward Algorithm for Solving Evaluation Problems on Unstable Statistical Data Set

Nowadays, Hidden Markov models are extensively utilized for modeling stochastic processes. These models help researchers establish and implement the desired theoretical foundations using Markov algorithms such as the Forward algorithm. However, using the stability hypothesis and the mean statistic for determining the values of Markov functions on unstable statistical data sets has led to a significant reducti...

Estimating rate constants in hidden Markov models by the EM algorithm

The EM algorithm, e.g., the Baum–Welch re-estimation, is an important tool for parameter estimation in discrete-time hidden Markov models. We present a direct re-estimation of rate constants for applications in which the underlying Markov process is continuous in time. Previous estimation of discrete-time transition probabilities is not necessary.

Publication date: 2012